TopCom: Index for Shortest Distance Query in Directed Graph
Finding the shortest distance between two vertices in a graph is an important
problem due to its numerous applications in diverse domains, including
geo-spatial databases, social network analysis, and information retrieval.
Classical algorithms (such as Dijkstra's) solve this problem in polynomial time,
but these algorithms cannot provide real-time response for a large number of
bursty queries on a large graph. So, indexing-based solutions that pre-process
the graph to efficiently answer (exactly or approximately) a large number
of distance queries in real time are becoming increasingly popular. Existing
solutions have varying performance in terms of index size, index building time,
query time, and accuracy. In this work, we propose TopCom, a novel
indexing-based solution for exactly answering distance queries. Our experiments
with two of the existing state-of-the-art methods (IS-Label and TreeMap) show
the superiority of TopCom over these two methods considering scalability and
query time. Besides, indexing of TopCom exploits the DAG (directed acyclic
graph) structure in the graph, which makes it significantly faster than the
existing methods if the SCCs (strongly connected components) of the input graph
are relatively small.
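For context, the classical per-query baseline that the abstract contrasts with indexing can be sketched as below. This is a minimal illustration of Dijkstra's algorithm on a directed graph, not TopCom itself; the function name `dijkstra_distance`, the adjacency-list layout, and `example_graph` are assumptions made for this example.

```python
import heapq

def dijkstra_distance(graph, source, target):
    """Return the shortest distance from source to target, or None if unreachable.

    graph is an assumed layout: dict mapping a vertex to a list of
    (neighbor, non_negative_weight) pairs for its outgoing edges.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path to u was already settled
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return None

# Hypothetical toy graph, used only to exercise the function.
example_graph = {
    "a": [("b", 2), ("c", 5)],
    "b": [("c", 1), ("d", 4)],
    "c": [("d", 1)],
    "d": [],
}

print(dijkstra_distance(example_graph, "a", "d"))
print(dijkstra_distance(example_graph, "d", "a"))
```

Every call explores the graph from scratch, which is the per-query cost that an index built in a pre-processing step is meant to avoid.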
FS^3: A Sampling based method for top-k Frequent Subgraph Mining
Mining labeled subgraphs is a popular research task in data mining because of
its potential application in many different scientific domains. All the
existing methods for this task explicitly or implicitly solve the subgraph
isomorphism task, which is computationally expensive, so they suffer from a
lack of scalability when the graphs in the input database are large. In
this work, we propose FS^3, which is a sampling-based method. It mines a small
collection of subgraphs that are most frequent in the probabilistic sense. FS^3
performs a Markov Chain Monte Carlo (MCMC) sampling over the space of
fixed-size subgraphs such that the potentially frequent subgraphs are sampled
more often. Besides, FS^3 is equipped with an innovative queue manager. It
stores the sampled subgraphs in a finite queue over the course of mining in such
a manner that the top-k positions in the queue contain the most frequent
subgraphs. Our experiments on databases of large graphs show that FS^3 is
efficient, and it obtains subgraphs that are the most frequent amongst the
subgraphs of a given size.
Con-S2V: A Generic Framework for Incorporating Extra-Sentential Context into Sen2Vec
We present a novel approach to learn distributed representations of sentences from unlabeled data by modeling both content and context of a sentence. The content model learns sentence representation by predicting its words. On the other hand, the context model comprises a neighbor prediction component and a regularizer to model distributional and proximity hypotheses, respectively. We propose an online algorithm to train the model components jointly. We evaluate the models in a setup where contextual information is available. The experimental results on tasks involving classification, clustering, and ranking of sentences show that our model outperforms the best existing models by a wide margin across multiple datasets.
Name Disambiguation from link data in a collaboration graph using temporal and topological features
In a social community, multiple persons may share the same name, phone number
or some other identifying attributes. This, along with other phenomena such as
name abbreviation, name misspelling, and human error, leads to erroneous
aggregation of records of multiple persons under a single reference. Such
mistakes affect the performance of document retrieval, web search, and database
integration, and more importantly, cause improper attribution of credit (or blame).
The task of entity disambiguation partitions the records belonging to multiple
persons with the objective that each decomposed partition is composed of
records of a unique person. Existing solutions to this task use either
biographical attributes, or auxiliary features that are collected from external
sources, such as Wikipedia. However, for many scenarios, such auxiliary
features are not available, or they are costly to obtain. Besides, attempting
to collect biographical or external data carries the risk of privacy
violation. In this work, we propose a method for solving the entity disambiguation
task from link information obtained from a collaboration network. Our method is
non-intrusive of privacy as it uses only the time-stamped graph topology of an
anonymized network. Experimental results on two real-life academic
collaboration networks show that the proposed method has satisfactory
performance. Comment: The short version of this paper has been accepted to ASONAM 201
Incremental eigenpair computation for graph Laplacian matrices: theory and applications
The smallest eigenvalues and the associated eigenvectors (i.e., eigenpairs) of a graph Laplacian matrix have been widely used for spectral clustering and community detection. However, in real-life applications, the number of clusters or communities (say, K) is generally unknown a priori. Consequently, the majority of the existing methods either choose K heuristically or repeat the clustering method with different choices of K and accept the best clustering result. The first option often yields a suboptimal result, while the second option is computationally expensive. In this work, we propose an incremental method for constructing the eigenspectrum of the graph Laplacian matrix. This method leverages the eigenstructure of the graph Laplacian matrix to obtain the Kth smallest eigenpair of the Laplacian matrix given a collection of all previously compute
Honey bee foraging: persistence to non-rewarding feeding locations and waggle dance communication
The honey bee, Apis mellifera, is important in agriculture and also as a model species in scientific research. This Master’s thesis is focused on honey bee foraging behaviour. It contains two independent experiments, each on a different subject within the area of foraging. Both use a behavioural ecology approach, with one investigating foraging behaviour and the other foraging communication. These form chapters 2 and 3 of the thesis, after an introductory chapter.
Chapter 2. Experiment 1: Persistence to unrewarding feeding locations by forager honey bees (Apis mellifera): the effects of experience, resource profitability, and season
This study shows that the persistence of honey bee foragers to unrewarding food sources, measured both in duration and number of visits, was greater to locations that previously offered sucrose solution of higher concentration (2 versus 1 molar) or were closer to the hive (20 versus 450 m). Persistence was also greater in bees which had longer access at the feeder before the syrup was terminated (2 versus 0.5 h). These results indicate that persistence is greater for more rewarding locations. However, persistence was not higher in the season of lowest nectar availability in the environment.
Chapter 3. Experiment 2: Honey bee waggle dance communication: signal meaning and signal noise affect dance follower behaviour
This study shows that honey bee foragers follow fewer waggle runs as the distance to the food source advertised by the dance increases, but invest more time in following these dances. This is because waggle run duration increases with increasing foraging distance. The number of waggle runs followed for distant food sources was further reduced by increased angular noise among waggle runs within a dance. The number of dance followers per dancing bee was affected by the time of year and varied among colonies. Both noise in the message, that is, variation in the direction component, and the message itself, that is, the distance of the advertised food location, affect dance following. These results indicate that dance followers pay attention to the costs and benefits associated with using dance information.
The Effectiveness of Educational Games on Scientific Concepts Acquisition in First Grade Students in Science
This study aimed at investigating the effectiveness of educational games on scientific concepts acquisition by first grade students. The sample of the study consisted of 53 male and female students distributed into two groups: an experimental group (n=26), which was taught by educational games, and a control group (n=27), which was taught by the traditional method. To achieve the purpose of the study, the researcher developed a teaching guide that included eight educational games, and a test to measure scientific concepts acquisition. Results showed that there were statistically significant differences in students’ scientific concepts acquisition due to the method of teaching, in favor of the experimental group. Also, there were no statistically significant differences in students’ scientific concepts acquisition due to gender or the interaction between method of teaching and gender. The study recommended using educational games in teaching science in primary education. Keywords: Educational Games, Scientific Concepts, Science
The degree of career polarization among educational leaders in the Jordanian Education Directorates
The study aimed to identify the degree of career polarization among educational leaders in the Jordanian education directorates of Ajloun and Jerash. The researchers adopted the descriptive-analytical approach for its suitability for such studies. The researchers used a questionnaire as the study instrument, which comprised 20 items distributed across two domains, with 10 items per domain. The sample of the study comprised 250 educational leaders for the first semester of the academic year 2019-2020. The study results showed that the degree of career polarization among educational leaders in the Jordanian Ministry of Education received an average rating in all its domains and for all items. The results also showed that there were no statistically significant differences at the level of statistical significance (α=0.05) attributed to the two study variables, gender and number of years of experience.
Name Disambiguation in Anonymized Graphs using Network Embedding
In the real world, our DNA is unique but many people share names. This phenomenon often causes erroneous aggregation of documents of multiple persons who are namesakes of one another. Such mistakes deteriorate the performance of document retrieval and web search, and more seriously, cause improper attribution of credit or blame in digital forensics. To resolve this issue, the name disambiguation task aims to partition the documents associated with a name reference such that each partition contains documents pertaining to a unique real-life person. Existing solutions to this task rely substantially on feature engineering, such as biographical feature extraction, or construction of auxiliary features from Wikipedia. However, for many scenarios, such features may be costly to obtain or unavailable due to the risk of privacy violation. In this work, we propose a novel name disambiguation method. Our proposed method is non-intrusive of privacy because, instead of using attributes pertaining to a real-life person, it leverages only relational data in the form of anonymized graphs. In the methodological aspect, the proposed method uses a novel representation learning model to embed each document in a low-dimensional vector space where name disambiguation can be solved by a hierarchical agglomerative clustering algorithm. Our experimental results demonstrate that the proposed method is significantly better than the existing name disambiguation methods working in a similar setting.
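The final clustering step mentioned in the abstract can be illustrated with a small sketch of hierarchical agglomerative clustering over document vectors. The abstract does not specify the linkage criterion or how the number of clusters is chosen, so average linkage, Euclidean distance, a known cluster count, and the toy `document_vectors` are all assumptions made for this example; the embedding model itself is not shown.

```python
import math

def average_linkage(c1, c2, points):
    """Mean Euclidean distance between all cross-cluster pairs (assumed linkage)."""
    total = sum(math.dist(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))

def agglomerative_clustering(points, num_clusters):
    """Merge the two closest clusters repeatedly until num_clusters remain.

    Returns a list of clusters, each a sorted list of indices into points.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > num_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], points)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge b into a
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical 2-D embeddings of five documents sharing one name reference.
document_vectors = [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1), (5.1, 5.0), (0.05, 0.05)]

result = agglomerative_clustering(document_vectors, num_clusters=2)
print(sorted(result))
```

Each resulting cluster would correspond to one hypothesized real-life person; in practice the number of persons is usually unknown, so a stopping criterion would replace the fixed `num_clusters`.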